AITopics | jnull 2

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimal Estimation in Mixed-Membership Stochastic Block Models

Noskov, Fedor, Panov, Maxim

arXiv.org Artificial Intelligence, Jul-26-2023

Community detection is one of the most critical problems in modern network science. Its applications can be found in various fields, from protein modeling to social network analysis. Recently, many papers have studied the problem of overlapping community detection, where each node of a network may belong to several communities. In this work, we consider the Mixed-Membership Stochastic Block Model (MMSB), first proposed by Airoldi et al. (2008). MMSB provides quite a general setting for modeling overlapping community structure in graphs. The central question of this paper is how to reconstruct relations between communities given an observed network. We compare different approaches and establish the minimax lower bound on the estimation error. Then, we propose a new estimator that matches this lower bound. Theoretical results are proved under fairly general conditions on the considered model. Finally, we illustrate the theory in a series of experiments.
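To make the setting concrete, here is a minimal sketch of how a network could be generated under a mixed-membership model. It assumes the common formulation in which node i has a membership vector theta_i on the probability simplex and an edge between i and j appears with probability theta_i^T B theta_j, where B is the community relation matrix. The abstract does not spell out this formula, so treat it, the function names, and the toy values as illustrative assumptions rather than the authors' exact setup.

```python
import random

def edge_probabilities(theta, B):
    """Return P = Theta * B * Theta^T as a nested list (assumed MMSB form)."""
    n, K = len(theta), len(B)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            P[i][j] = sum(theta[i][a] * B[a][b] * theta[j][b]
                          for a in range(K) for b in range(K))
    return P

def sample_adjacency(P, seed=0):
    """Draw a symmetric 0/1 adjacency matrix without self-loops."""
    rng = random.Random(seed)
    n = len(P)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = int(rng.random() < P[i][j])
    return A

# Hypothetical toy example: two pure nodes and one node split evenly.
theta = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
B = [[0.9, 0.1], [0.1, 0.8]]
P = edge_probabilities(theta, B)
A = sample_adjacency(P)
```

The estimation problem described in the abstract runs in the opposite direction: only A is observed, and the goal is to recover B.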

data mining, ew 2, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.1453

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Russia (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Overparameterized ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery

Wang, Yifei, Hua, Yixuan, Candès, Emmanuel, Pilanci, Mert

arXiv.org Artificial Intelligence, Feb-17-2023

The practice of deep learning has shown that neural networks generalize remarkably well even with an extreme number of learned parameters. This appears to contradict traditional statistical wisdom, in which a trade-off between model complexity and fit to the data is essential. We aim to address this discrepancy by adopting a convex optimization and sparse recovery perspective. We consider the training and generalization properties of two-layer ReLU networks with standard weight decay regularization. Under certain regularity assumptions on the data, we show that ReLU networks with an arbitrary number of parameters learn only simple models that explain the data. This is analogous to the recovery of the sparsest linear model in compressed sensing. For ReLU networks and their variants with skip connections or normalization layers, we present isometry conditions that ensure the exact recovery of planted neurons. For randomly generated data, we show the existence of a phase transition in recovering planted neural network models, which is easy to describe: whenever the ratio between the number of samples and the dimension exceeds a numerical threshold, the recovery succeeds with high probability; otherwise, it fails with high probability. Surprisingly, ReLU networks learn simple and sparse models that generalize well even when the labels are noisy. The phase transition phenomenon is confirmed through numerical experiments.

artificial intelligence, machine learning, optimal solution, (19 more...)

arXiv.org Artificial Intelligence

2209.15265

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

One-Way Matching of Datasets with Low Rank Signals

Chen, Shuxiao, Jiang, Sizun, Ma, Zongming, Nolan, Garry P., Zhu, Bokai

arXiv.org Artificial Intelligence, Oct-3-2022

A major motivation of the present work is the prevalence of data matching in analyzing single-cell multi-omics data. In single-cell biology research, it is routine to compile datasets obtained in different batches but with similar measurement protocols or under similar experimental conditions. When handling such datasets, matching similar cells in different datasets is often a critical step for the correction of technical variations and batch effects [39]. As another common practice, cell biologists routinely integrate datasets with (partially) overlapping biological (e.g., transcriptomic and proteomic) information collected from different experimental conditions, profiling technologies, tissues, or species (e.g., [38, 41, 23]) to better understand and define cell states. To achieve such goals, it is necessary to identify and align cells in comparable states across related datasets.

artificial intelligence, jnull 2, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2204.13858

Country:

North America > United States > Pennsylvania (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Vision (0.67)

Adaptive Estimation and Uniform Confidence Bands for Nonparametric IV

Chen, Xiaohong, Christensen, Timothy, Kankanala, Sid

arXiv.org Machine Learning, Jul-25-2021

We introduce computationally simple, data-driven procedures for estimation and inference on a structural function $h_0$ and its derivatives in nonparametric models using instrumental variables. Our first procedure is a bootstrap-based, data-driven choice of sieve dimension for sieve nonparametric instrumental variables (NPIV) estimators. When implemented with this data-driven choice, sieve NPIV estimators of $h_0$ and its derivatives are adaptive: they converge at the best possible (i.e., minimax) sup-norm rate, without having to know the smoothness of $h_0$, degree of endogeneity of the regressors, or instrument strength. Our second procedure is a data-driven approach for constructing honest and adaptive uniform confidence bands (UCBs) for $h_0$ and its derivatives. Our data-driven UCBs guarantee coverage for $h_0$ and its derivatives uniformly over a generic class of data-generating processes (honesty) and contract at, or within a logarithmic factor of, the minimax sup-norm rate (adaptivity). As such, our data-driven UCBs deliver asymptotic efficiency gains relative to UCBs constructed via the usual approach of undersmoothing. In addition, both our procedures apply to nonparametric regression as a special case. We use our procedures to estimate and perform inference on a nonparametric gravity equation for the intensive margin of firm exports and find evidence against common parameterizations of the distribution of unobserved firm productivity.

procedure, theorem 4, ucb, (14 more...)

arXiv.org Machine Learning

2107.11869

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Convergence of Graph Laplacian with kNN Self-tuned Kernels

Cheng, Xiuyuan, Wu, Hau-Tieng

arXiv.org Machine Learning, Nov-2-2020

The kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij} = k_0( \frac{\| x_i - x_j \|^2}{\sigma^2} )$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma$, and a common practice called self-tuned kernel adaptively sets a $\sigma_i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance. When the $x_i$ are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space, theoretical results on graph Laplacian convergence with self-tuned kernels, unlike those for fixed-bandwidth kernels, have been incomplete. This paper proves the convergence of the graph Laplacian operator $L_N$ to the manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $W^{(\alpha)}_{ij} = k_0( \frac{\| x_i - x_j \|^2}{\epsilon \hat{\rho}(x_i) \hat{\rho}(x_j)})/\hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha$, where $\hat{\rho}$ is the bandwidth function estimated by kNN, and the limiting operator is also parametrized by $\alpha$. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\Delta_p$. Specifically, we prove the pointwise convergence of $L_N f$ and convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ consistency for $\hat{\rho}$ which bounds the relative estimation error $|\hat{\rho} - \bar{\rho}|/\bar{\rho}$ uniformly with high probability, where $\bar{\rho} = p^{-1/d}$, and $p$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via smaller variance error in low-density regions. In the algorithm, no prior knowledge of $d$ or data density is needed. The theoretical results are supported by numerical experiments.
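The kernel family $W^{(\alpha)}$ in the abstract can be sketched in a few lines. The version below makes several assumptions that the abstract leaves open: $k_0(r) = e^{-r}$ is used as the base kernel, $\hat{\rho}(x_i)$ is taken to be the raw distance from $x_i$ to its $k$-th nearest neighbor (any normalization constant is omitted), the formula is read as dividing by the product $\hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha$, and all points are distinct so that no bandwidth is zero. Function names are hypothetical.

```python
import math

def knn_bandwidth(points, k):
    """rho_hat(x_i): distance from x_i to its k-th nearest other point."""
    rho = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        rho.append(dists[k - 1])
    return rho

def self_tuned_kernel(points, k, eps, alpha, k0=lambda r: math.exp(-r)):
    """W[i][j] = k0(|xi - xj|^2 / (eps * rho_i * rho_j)) / (rho_i^alpha * rho_j^alpha)."""
    rho = knn_bandwidth(points, k)
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = math.dist(points[i], points[j]) ** 2
            scale = eps * rho[i] * rho[j]
            W[i][j] = k0(sq_dist / scale) / (rho[i] ** alpha * rho[j] ** alpha)
    return W

# Hypothetical toy data: three nearby points and one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
rho = knn_bandwidth(points, k=1)
W = self_tuned_kernel(points, k=1, eps=1.0, alpha=1.0)
```

The isolated point receives a larger bandwidth, which is the mechanism behind the self-tuned kernel's behavior in low-density regions.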

artificial intelligence, convergence, machine learning, (18 more...)

arXiv.org Machine Learning

2011.01479

Country:

North America > United States (0.14)
Asia > Middle East > Israel (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Feature Purification: How Adversarial Training Performs Robust Deep Learning

Allen-Zhu, Zeyuan, Li, Yuanzhi

arXiv.org Machine Learning, Sep-17-2020

Despite the empirical success of using Adversarial Training to defend deep learning models against adversarial perturbations, so far, it still remains rather unclear what the principles are behind the existence of adversarial perturbations, and what adversarial training does to the neural network to remove them. In this paper, we present a principle that we call Feature Purification, where we show one of the causes of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training process of a neural network; and more importantly, one of the goals of adversarial training is to remove such mixtures to purify hidden weights. We present both experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle. Technically, we give, to the best of our knowledge, the first result proving that the following two can hold simultaneously for training a neural network with ReLU activation. (1) Training over the original data is indeed non-robust to small adversarial perturbations of some radius. (2) Adversarial training, even with an empirical perturbation algorithm such as FGM, can in fact be provably robust against ANY perturbations of the same radius. Finally, we also prove a complexity lower bound, showing that low complexity models such as linear classifiers, low-degree polynomials, or even the neural tangent kernel for this network, CANNOT defend against perturbations of this same radius, no matter what algorithms are used to train them.
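As a rough illustration of the kind of empirical perturbation algorithm the abstract refers to, the sketch below applies one fast-gradient-style step to a linear classifier with logistic loss. It assumes that FGM means moving the input along the $\ell_2$-normalized gradient of the loss by a fixed radius. The linear model, the loss, and all names here are simplifications chosen for the example and are not the two-layer ReLU setting analyzed in the paper.

```python
import math

def logistic_loss_grad_x(w, x, y):
    """Gradient with respect to x of log(1 + exp(-y * <w, x>)), with y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]

def fgm_perturb(w, x, y, radius):
    """Move x by `radius` along the normalized loss gradient (assumed FGM step)."""
    g = logistic_loss_grad_x(w, x, y)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(x)  # no ascent direction; leave the input unchanged
    return [xi + radius * gi / norm for xi, gi in zip(x, g)]

# Hypothetical toy example.
w = [3.0, 4.0]
x = [1.0, 1.0]
x_adv = fgm_perturb(w, x, y=1, radius=0.5)
margin_before = sum(wi * xi for wi, xi in zip(w, x))
margin_after = sum(wi * xi for wi, xi in zip(w, x_adv))
```

For a linear model the step reduces the margin by exactly the radius times the norm of w, which makes the effect of the perturbation easy to check by hand.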

artificial intelligence, machine learning, nullw, (17 more...)

arXiv.org Machine Learning

2005.1019

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Gambling (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Entropy Regularized Power k-Means Clustering

Chakraborty, Saptarshi, Paul, Debolina, Das, Swagatam, Xu, Jason

arXiv.org Machine Learning, Jan-10-2020

Despite its well-known shortcomings, $k$-means remains one of the most widely used approaches to data clustering. Current research continues to tackle its flaws while attempting to preserve its simplicity. Recently, the power $k$-means algorithm was proposed to avoid becoming trapped in local minima by annealing through a family of smoother surfaces. However, the approach lacks theoretical justification and fails in high dimensions when many features are irrelevant. This paper addresses these issues by introducing entropy regularization to learn feature relevance while annealing. We prove consistency of the proposed approach and derive a scalable majorization-minimization algorithm that enjoys closed-form updates and convergence guarantees. In particular, our method retains the same computational complexity as $k$-means and power $k$-means, but yields significant improvements over both. Its merits are thoroughly assessed on a suite of real and synthetic data experiments.
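The "family of smoother surfaces" can be illustrated with power means. The sketch below assumes the usual power $k$-means construction, in which the minimum over squared distances to the centers is replaced by the power mean $M_s(y) = (\frac{1}{k}\sum_j y_j^s)^{1/s}$ and $s$ is annealed toward $-\infty$, where $M_s$ approaches the minimum. The abstract does not state this formula, the entropy-regularized feature weights are not included, and the function names are hypothetical.

```python
def power_mean(values, s):
    """Power mean M_s of strictly positive values, for s != 0."""
    if s == 0 or any(v <= 0 for v in values):
        raise ValueError("requires s != 0 and strictly positive values")
    return (sum(v ** s for v in values) / len(values)) ** (1.0 / s)

def squared_distances(point, centers):
    """Squared Euclidean distance from one point to every center."""
    return [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]

def kmeans_objective(points, centers):
    """Classical k-means objective: sum of squared distances to the nearest center."""
    return sum(min(squared_distances(p, centers)) for p in points)

def power_kmeans_objective(points, centers, s):
    """Smoothed objective: the minimum is replaced by the power mean M_s."""
    return sum(power_mean(squared_distances(p, centers), s) for p in points)

# Hypothetical toy example: the smoothed objective approaches k-means as s decreases.
points = [(0.0, 0.0), (4.0, 0.0)]
centers = [(1.0, 0.0), (3.0, 0.0)]
exact = kmeans_objective(points, centers)
smooth_start = power_kmeans_objective(points, centers, s=-1.0)
smooth_late = power_kmeans_objective(points, centers, s=-50.0)
```

Because the power mean never falls below the minimum, each smoothed objective upper-bounds the k-means objective and tightens as $s$ decreases.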

algorithm, consistency, jnull 2, (17 more...)

arXiv.org Machine Learning

2001.03452

Country:

Asia > Middle East > Jordan (0.04)
Asia > India > West Bengal > Kolkata (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
